Mean Field Markov Decision Processes
Abstract
We consider mean-field control problems in discrete time with discounted reward, infinite time horizon and compact state and action space. The existence of optimal policies is shown, and the limiting mean-field problem is derived when the number of individuals tends to infinity. Moreover, we consider the average reward problem and show that the optimal policy in this mean-field limit is $\varepsilon$-optimal for the discounted problem if the number of individuals is large and the discount factor is close to one. This result is very helpful, because it turns out that in the special case where the reward depends only on the distribution of the individuals, we obtain a very interesting subclass of problems in which an average reward optimal policy can be obtained by first computing an optimal measure from a static optimization problem and then achieving it with Markov Chain Monte Carlo methods. We give two applications: avoiding congestion on a graph and optimal positioning on a market place, both of which we solve explicitly.
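The last step mentioned in the abstract, steering the population toward a given measure with Markov Chain Monte Carlo, can be illustrated with a minimal sketch. This is not the paper's construction: the graph, the target measure `target`, and the helper `metropolis_kernel` are hypothetical, and the target is simply assumed to be given rather than computed from a static optimization problem. The sketch builds a standard Metropolis-Hastings kernel with uniform neighbour proposals and iterates the population distribution under it.

```python
import numpy as np


def metropolis_kernel(adjacency, target):
    """Transition matrix of a Metropolis-Hastings chain on a graph.

    Proposal: a uniformly chosen neighbour. The acceptance ratio makes
    `target` the stationary distribution (detailed balance).
    """
    n = len(target)
    degree = adjacency.sum(axis=1)
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adjacency[i, j]:
                ratio = (target[j] * degree[i]) / (target[i] * degree[j])
                P[i, j] = min(1.0, ratio) / degree[i]
        # Rejected proposals keep the individual at its current node.
        P[i, i] = 1.0 - P[i].sum()
    return P


# Hypothetical example data: a path graph 0-1-2-3 and an arbitrary target measure.
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
target = np.array([0.1, 0.2, 0.3, 0.4])

P = metropolis_kernel(adjacency, target)

# Start with the whole population on node 0 and let the distribution evolve.
mu = np.array([1.0, 0.0, 0.0, 0.0])
for _ in range(500):
    mu = mu @ P

print(np.round(mu, 4))
```

If every individual moves independently according to `P`, the distribution of the population is driven toward `target` regardless of the starting point, which is the sense in which a measure can be "achieved" by such a chain in this toy setting.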
Related resources
Mean-Variance Optimization in Markov Decision Processes
We consider finite horizon Markov decision processes under performance measures that involve both the mean and the variance of the cumulative reward. We show that either randomized or history-based policies can improve performance. We prove that the complexity of computing a policy that maximizes the mean reward under a variance constraint is NP-hard for some cases, and strongly NP-hard for oth...
Mean Field Approximation of the Policy Iteration Algorithm for Graph-Based Markov Decision Processes
In this article, we consider a compact representation of multidimensional Markov Decision Processes based on Graphs (GMDP). The states and actions of a GMDP are multidimensional and attached to the vertices of a graph allowing the representation of local dynamics and rewards. This approach is in the line of approaches based on Dynamic Bayesian Networks. For policy optimisation, a direct applica...
Energy and Mean-Payoff Parity Markov Decision Processes
We consider Markov Decision Processes (MDPs) with mean-payoff parity and energy parity objectives. In system design, the parity objective is used to encode ω-regular specifications, and the mean-payoff and energy objectives can be used to model quantitative resource constraints. The energy condition requires that the resource level never drops below 0, and the mean-payoff condition requires tha...
Algorithmic aspects of mean-variance optimization in Markov decision processes
We consider finite horizon Markov decision processes under performance measures that involve both the mean and the variance of the cumulative reward. We show that either randomized or history-based policies can improve performance. We prove that the complexity of computing a policy that maximizes the mean reward under a variance constraint is NP-hard for some cases, and strongly NP-hard for oth...
Risk-Sensitive and Mean Variance Optimality in Markov Decision Processes
In this note, we compare two approaches for handling risk-variability features arising in discrete-time Markov decision processes: models with exponential utility functions and mean variance optimality models. Computational approaches for finding optimal decision with respect to the optimality criteria mentioned above are presented and analytical results showing connections between the above op...
Journal
Journal title: Applied Mathematics and Optimization
Year: 2023
ISSN: 0095-4616, 1432-0606
DOI: https://doi.org/10.1007/s00245-023-09985-1